This page last changed on Dec 18, 2007 by aaime.

How a GetFeature Request Works

This page will describe how a getFeature() request works in Geoserver. It will also show you how to set up Eclipse and use the debugger to run through a getFeature() request yourself.

There are three parts to a getFeature() request:

  1. DataSet
  2. Request
  3. Response

Try it in Eclipse

You can walk through this tutorial using the debugger by following these steps:

  1. Make sure your Eclipse is set up to run the debugger. If not, refer to this page
  2. Put a breakpoint in the method 'doGet()' in WfsDispatcher
  3. Run Geoserver in Debug mode: Run -> Debug... -> select your debug task -> hit 'Debug'
  4. Open up your favorite web browser and click this link

DataSet

A dataset is the data that will be read by the getFeature() request. It is what makes up our maps.

Here is an example of a PostGIS database (or a shapefile) with the following data in it:

tiger_ny=# select state_name,state_abbr,persons::int4,families::int4,houshold::int4,male::int4,
           female::int4, geometrytype(the_geom) as the_geom from states_postgis;
  state_name   | state_abbr | persons  | families | houshold |  male   | female  |   the_geom
---------------+------------+----------+----------+----------+---------+---------+--------------
 Iowa          | IA         |  2776755 |   740819 |  1064325 | 1344802 | 1431953 | MULTIPOLYGON ...
 Massachusetts | MA         |  6016425 |  1514746 |  2247110 | 2888745 | 3127680 | MULTIPOLYGON ...
 Nebraska      | NE         |  1578385 |   415427 |   602363 |  769439 |  808946 | MULTIPOLYGON ...
 New York      | NY         | 18235907 |  4548344 |  6746555 | 8739138 | 9496769 | MULTIPOLYGON ...
 Pennsylvania  | PA         | 11881643 |  3155989 |  4495966 | 5694265 | 6187378 | MULTIPOLYGON ...
 Connecticut   | CT         |  3287116 |   864493 |  1230479 | 1592873 | 1694243 | MULTIPOLYGON ...
 Rhode Island  | RI         |  1003464 |   258886 |   377977 |  481496 |  521968 | MULTIPOLYGON ...
 New Jersey    | NJ         |  7484736 |  1962314 |  2687478 | 3622220 | 3862516 | MULTIPOLYGON ...
 Indiana       | IN         |  5544159 |  1480351 |  2065355 | 2688281 | 2855878 | MULTIPOLYGON ...
 Nevada        | NV         |  1201833 |   307400 |   466297 |  611880 |  589953 | MULTIPOLYGON ...
...

In Reality

Geoserver ships with a USA states shapefile that has all the above data in it. It also contains more columns than the above, but I've simplified it for easier reading. You can put the Geoserver "states" shapefile in PostGIS, using the shp2pgsql program that comes with PostGIS, and use a PostGIS datastore instead.

NOTE: This discussion is pretty much the same for any datastore (e.g. Oracle, shapefiles, DB2, SDE).

In the above example of the data, a Feature is one of the rows. For example

Iowa          | IA         |  2776755 |   740819 |  1064325 | 1344802 | 1431953 | MULTIPOLYGON ...

is a Feature.

A getFeature request is targeted at the data and brings back Features in the form of GML.

Request

A request can be sent to Geoserver as a GET or a POST; both are handled similarly.

The getFeature process keeps the distinction between a GET and POST until it hits the FeatureRequest object: org.vfny.geoserver.wfs.requests.FeatureRequest. From FeatureRequest onward the code is no longer forked, and the request is processed in one place, execute(). Read on for more details.

GET and POST

Here is an example HTTP GET request

http://localhost:8080/geoserver/wfs?
request=getfeature&
service=wfs&
version=1.0.0&
typename=states&
filter=<ogc:Filter
xmlns:ogc="http://www.opengis.net/ogc" xmlns:gml="http://www.opengis.net/gml">
<ogc:BBOX>
<ogc:PropertyName>the_geom</ogc:PropertyName>
<gml:Box srsName="http://www.opengis.net/gml/srs/epsg.xml#4326">
<gml:coordinates>-73.99312376470733,40.76203427979042 -73.9239210030026,40.80129519821393</gml:coordinates>
</gml:Box>
</ogc:BBOX>
</ogc:Filter>

Try it

If you have Geoserver set up locally on port 8080, you can enter the above URL and Geoserver will process it.

Try it with this link
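
Pasted into a browser, the filter XML above only works because the browser percent-encodes it for you. When building the URL in code you have to do that yourself. The following is a minimal sketch, not Geoserver code: the class GetFeatureUrl and its build() method are hypothetical names, and the filter string uses the namespace and srsName from the POST example further down.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Geoserver): assembles a GetFeature GET URL.
class GetFeatureUrl {

    static String build(String endpoint, String typeName, String filterXml) {
        // LinkedHashMap keeps the parameters in the order they are added.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("request", "getfeature");
        params.put("service", "wfs");
        params.put("version", "1.0.0");
        params.put("typename", typeName);
        params.put("filter", filterXml);

        StringBuilder url = new StringBuilder(endpoint).append('?');
        boolean first = true;
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            first = false;
            // Form-encode the value so '<', '>', '"' and spaces are safe in a URL.
            url.append(entry.getKey())
               .append('=')
               .append(URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        String filter =
            "<ogc:Filter xmlns:ogc=\"http://www.opengis.net/ogc\" "
            + "xmlns:gml=\"http://www.opengis.net/gml\">"
            + "<ogc:BBOX><ogc:PropertyName>the_geom</ogc:PropertyName>"
            + "<gml:Box srsName=\"http://www.opengis.net/gml/srs/epsg.xml#4326\">"
            + "<gml:coordinates>-73.99312376470733,40.76203427979042 "
            + "-73.9239210030026,40.80129519821393</gml:coordinates>"
            + "</gml:Box></ogc:BBOX></ogc:Filter>";
        System.out.println(build("http://localhost:8080/geoserver/wfs", "states", filter));
    }
}
```

Running main() prints a single-line URL that can be pasted into a browser or handed to any HTTP client.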

 

Here is an example HTTP XML POST request

http://localhost:8080/geoserver/wfs

<wfs:GetFeature service="WFS" version="1.0.0"
  outputFormat="GML2"
  xmlns:topp="http://www.openplans.org/topp"
  xmlns:wfs="http://www.opengis.net/wfs"
  xmlns:ogc="http://www.opengis.net/ogc"
  xmlns:gml="http://www.opengis.net/gml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.opengis.net/wfs
                      http://schemas.opengis.net/wfs/1.0.0/WFS-basic.xsd">
  <wfs:Query typeName="states">
    <ogc:Filter>
      <ogc:BBOX>
        <ogc:PropertyName>the_geom</ogc:PropertyName>
        <gml:Box srsName="http://www.opengis.net/gml/srs/epsg.xml#4326">
           <gml:coordinates>
               -73.99312376470733,40.76203427979042 -73.9239210030026,40.80129519821393
           </gml:coordinates>
        </gml:Box>
      </ogc:BBOX>
   </ogc:Filter>
  </wfs:Query>
</wfs:GetFeature>
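
If you want to send the POST from code rather than from a test page, the request can be sketched with the JDK's own HTTP client (Java 11 or later). This is only an illustration: the class name GetFeaturePost is hypothetical, the "text/xml" content type is an assumption that this page does not confirm, and the body is a shortened GetFeature without the filter.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper (not part of Geoserver): builds the POST, but does not send it.
class GetFeaturePost {

    static HttpRequest build(String endpoint, String xmlBody) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                // Assumption: the server accepts the XML body as text/xml.
                .header("Content-Type", "text/xml")
                .POST(HttpRequest.BodyPublishers.ofString(xmlBody))
                .build();
    }

    public static void main(String[] args) {
        // Shortened body: a GetFeature for "states" with no filter.
        String body =
            "<wfs:GetFeature service=\"WFS\" version=\"1.0.0\" outputFormat=\"GML2\" "
            + "xmlns:wfs=\"http://www.opengis.net/wfs\">"
            + "<wfs:Query typeName=\"states\"/>"
            + "</wfs:GetFeature>";
        HttpRequest request = build("http://localhost:8080/geoserver/wfs", body);
        System.out.println(request.method() + " " + request.uri());
        // To actually send it (needs a running Geoserver):
        // java.net.http.HttpClient.newHttpClient()
        //     .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```

Building the request separately from sending it keeps the sketch runnable without a server.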

Exploring the HTTP GET request URL

There are 6 parts to the URL:

  1. The server address - http://localhost:8080/geoserver/wfs
  2. The request type - request=getfeature
  3. The service type - service=wfs
  4. The version - version=1.0.0
  5. The type name, also known as the data you are querying - typename=states
  6. The filter used to select exactly what you want from the type

The server address points to where your Geoserver instance is running. In this example, on the local machine on port 8080.

The request type is the command that you are sending to the server. In this case the URL is asking "get me some features". There are other commands that can be sent:

  • GetFeature (the case we are analyzing)
  • Transaction
  • LockFeature
  • GetFeatureWithLock
  • DescribeFeatureType
  • GetCapabilities

The service type tells the server which service you want. Here we want WFS. Another possible service is WMS.

The version is the version number of the WFS specification being used (1.0.0).

The type name is the FeatureType that you are querying, also known as the data. In our example, it is a shapefile that contains US states.

The filter is a restriction on our query. It pretty much says "restrict my query to only features in this bounding box". There are many filters you can use, but we will not explore them in this tutorial. Here is the sleep inducing OGC Filter specification if you really want to learn more.

How Geoserver interprets the request

Here is the overview of the program flow, in a bit of an abstract view.

Entry Point

When the request comes in, the servlet container (e.g. Jetty or Tomcat) sends the request to the WfsDispatcher. This is the entry point where Geoserver starts processing the request.

You can change where this entry point is by editing your web.xml file, located in %GEOSERVER_HOME%/server/geoserver/WEB-INF.

Here is the part of the xml file that we want:

  <servlet>
    <servlet-name>WfsDispatcher</servlet-name>
    <servlet-class>org.vfny.geoserver.wfs.servlets.WfsDispatcher</servlet-class>
  </servlet>
...
  <servlet-mapping>
    <servlet-name>WfsDispatcher</servlet-name>
    <url-pattern>/wfs/*</url-pattern>
  </servlet-mapping>

This says that any request to "/wfs/*" gets routed to the org.vfny.geoserver.wfs.servlets.WfsDispatcher servlet. Since both our requests (GET and POST) go to "http://localhost:8080/geoserver/wfs", the servlet container (e.g. Jetty or Tomcat) sends them to the WfsDispatcher.

WFS Dispatcher

There are two main methods in WfsDispatcher.java (located in org.vfny.geoserver.wfs.servlets):

public void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException

public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException

If you send an HTTP POST request, doPost gets called. If you send an HTTP GET request, doGet gets called. Get it?

POST
Since one cannot read the POST portion of an HTTP request more than once, a copy of the request is written to disk. This is done in WfsDispatcher.doPost().

NOTE: This is somewhat inefficient, but it was originally done this way because feature insert requests can be very large, and holding such a request in memory would not scale. A better (and faster) solution would be a simple class that holds a small request in memory or, if the request is large, writes it to disk.

DispatcherXMLReader will then use SAX to parse the XML. It looks at the first tag in the XML request (see DispatchHandler) and determines the request type from that. In our case (see above), it's "<wfs:GetFeature>", so we know this is a Dispatcher.GET_FEATURE_REQUEST request.
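
The idea of reading only the first tag can be sketched with the JDK's SAX parser. This is not the DispatcherXMLReader or DispatchHandler source; the class RootElementSniffer is a hypothetical stand-in that records the root element's local name and then aborts the parse, so the rest of a (possibly very large) request is never read.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the dispatcher's "look at the first tag" step.
class RootElementSniffer {

    /** Returns the local name of the root element, e.g. "GetFeature". */
    static String rootElementName(String xml) {
        // One-element array so the anonymous handler can record what it saw.
        final String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) throws SAXException {
                root[0] = localName;
                // The first start tag is all we need, so stop parsing here.
                throw new SAXException("root element found, stopping early");
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // "wfs:GetFeature" yields "GetFeature"
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            if (root[0] != null) {
                return root[0]; // our own early exit, not a real parse error
            }
            throw new IllegalArgumentException("Not well-formed XML", e);
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        throw new IllegalArgumentException("Document has no root element");
    }

    public static void main(String[] args) {
        String request = "<wfs:GetFeature xmlns:wfs=\"http://www.opengis.net/wfs\">"
            + "<wfs:Query typeName=\"states\"/></wfs:GetFeature>";
        System.out.println(rootElementName(request));
    }
}
```

A dispatcher would then map the returned name to a request type constant such as Dispatcher.GET_FEATURE_REQUEST.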

GET
This case is very easy: it just looks at the request URL for "request=GetFeature". You can see that in WfsDispatcher.doGet() and DispatcherKvpReader.

NOTE: "Kvp" means "Key-Value Pair". For the clause "request=GetFeature", the Key is "request" and the value is "GetFeature".
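
To make the key-value idea concrete, here is a small sketch that splits a query string into pairs and looks up the request type. It is not the DispatcherKvpReader source: KvpParser is a hypothetical class, and treating keys as case-insensitive is an assumption of this sketch. In a real servlet the container already does the splitting and decoding for you.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of key-value pair parsing (not Geoserver code).
class KvpParser {

    static Map<String, String> parse(String query) {
        // Assumption of this sketch: "REQUEST", "Request" and "request" are the same key.
        Map<String, String> kvp = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        if (query == null || query.isEmpty()) {
            return kvp;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // Undo the percent-encoding applied when the URL was built.
            kvp.put(URLDecoder.decode(key, StandardCharsets.UTF_8),
                    URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return kvp;
    }

    public static void main(String[] args) {
        Map<String, String> kvp =
            parse("request=GetFeature&service=wfs&version=1.0.0&typename=states");
        System.out.println("request type: " + kvp.get("REQUEST"));
    }
}
```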

Whether it was an HTTP GET or POST, the WfsDispatcher will create the appropriate servlet. Remember the 6 different request types: GetFeature, Transaction, LockFeature, GetFeatureWithLock, DescribeFeatureType, GetCapabilities? For 'GetFeature' it will create a Feature servlet: org.vfny.geoserver.wfs.servlets.Feature

The GET or POST request information is then passed to that servlet, along with a response object that the servlet will populate. The response will be described later in the response section.

Note

You might be wondering why a request goes through the (WFS) Dispatcher and then through a second (GetFeature) servlet.

If you look at the web.xml (the file the servlet container uses for configuration), you'll see these lines:

<servlet>
    <servlet-name>GetFeature</servlet-name>
    <servlet-class>org.vfny.geoserver.wfs.servlets.Feature</servlet-class>
  </servlet>

  <servlet-mapping>
    <servlet-name>GetFeature</servlet-name>
    <url-pattern>/wfs/GetFeature/*</url-pattern>
  </servlet-mapping>

This means you can actually send your GetFeature request directly to the GetFeature servlet instead of indirectly through the WFS dispatcher. Most people (and software) find it easier to send all requests to one URL rather than each request to a specific servlet. Geoserver lets you do either.

http://localhost:8080/geoserver/wfs/GetFeature

The diagram below shows where the distinction between GET and POST ends:

Not every object or class in the above diagram recognizes the difference, web.xml for example, but it is more to show the life of the GET and POST distinction.

Feature Servlet

The next stage of the getFeature request creates a FeatureRequest object and populates it with the query information.

Depending on whether a GET or a POST was used, different parsers are selected.

The two parsers are GetFeatureKvpReader, for HTTP GET, and GetFeatureXMLReader, for HTTP POST. These will then create the FeatureRequest object.

The FeatureRequest object will then head over to the feature type that was specified in the URL, in this example "states", and query the data.

Note

The feature request object does not actually get the features from the data store. It is important to note this. Performing a query actually gets you a FeatureReader object, and no real data. This is explained below in the Response section.

Response

The response is what is sent back to the user after their request has been processed. For a getFeature request, the format of this response is GML.

Here is an overview of the process in picture format:

Where it Starts

At the very beginning, in WfsDispatcher, an HttpServletResponse is passed into doGet() and doPost():

public void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException

public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException

Let's refer back to this diagram:

This response object is passed into the Feature servlet, so it can be populated once it has a hold of a FeatureReader.

Output Strategy Object

The output strategy object tells Geoserver how to proceed when returning the data. What does this mean? Here are some examples that are specified in the web.xml file to help explain it:

<context-param>
    <param-name>serviceStratagy</param-name>
    <!-- Meaning of the different values :
         BUFFER
         - stores the entire response in memory first, before sending it off to
           the user (may run out of memory)

         SPEED
         - outputs directly to the response (and cannot recover in the case of an
           error)

         FILE
         - outputs to the local filesystem first, before sending it off to the user
      -->
    <param-value>SPEED</param-value>
  </context-param>

Why is it called Strategy? Strategy is a design pattern defined in the Gang of Four Design Patterns book (ISBN 0201633612). Essentially, it lets the user plug in their own method of performing a specific task. Geoserver reads the web.xml file, sees which strategy you want to use (SPEED in our example), and plugs it into the output response process.

You can define your own output strategy object by looking in org.vfny.geoserver.servlets.AbstractService. It must implement org.vfny.geoserver.servlets.AbstractService.ServiceStrategy

Here is a tutorial on setting up your own output strategy.
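
To illustrate the difference between the SPEED and BUFFER values, here is a self-contained sketch of the pattern. The OutputStrategy interface below is hypothetical and deliberately simplified; it is not the real AbstractService.ServiceStrategy signature, and the FILE variant is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical, simplified strategy sketch (not the Geoserver interface).
class OutputStrategies {

    interface OutputStrategy {
        /** Returns the stream the encoder should write to. */
        OutputStream destination(OutputStream response) throws IOException;
        /** Called once encoding has finished without error. */
        void flush() throws IOException;
    }

    /** SPEED: write straight to the response; nothing can be taken back on error. */
    static final class Speed implements OutputStrategy {
        private OutputStream out;
        public OutputStream destination(OutputStream response) { out = response; return out; }
        public void flush() throws IOException { out.flush(); }
    }

    /** BUFFER: collect everything in memory, copy to the response only at the end. */
    static final class Buffer implements OutputStrategy {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private OutputStream response;
        public OutputStream destination(OutputStream response) {
            this.response = response;
            return buffer;
        }
        public void flush() throws IOException { buffer.writeTo(response); response.flush(); }
    }

    /** Picks a strategy by its configured name, as a web.xml param-value would. */
    static OutputStrategy forName(String name) {
        switch (name.toUpperCase(Locale.ROOT)) {
            case "SPEED": return new Speed();
            case "BUFFER": return new Buffer();
            default: throw new IllegalArgumentException("Unknown strategy: " + name);
        }
    }

    /** Returns {bytes in the response before flush, bytes in the response after flush}. */
    static int[] run(String strategyName, String payload) {
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        OutputStrategy strategy = forName(strategyName);
        try {
            OutputStream out = strategy.destination(response);
            out.write(payload.getBytes(StandardCharsets.UTF_8));
            int before = response.size();
            strategy.flush();
            return new int[] {before, response.size()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"SPEED", "BUFFER"}) {
            int[] sizes = run(name, "hello");
            System.out.println(name + ": before flush=" + sizes[0] + ", after flush=" + sizes[1]);
        }
    }
}
```

With SPEED the bytes reach the response as soon as they are written; with BUFFER nothing reaches it until flush(), which is why a buffered response can still be replaced by an error message if encoding fails halfway.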

Feature Streaming

It's very important to note that the Geoserver/Geotools design allows for Feature Streaming, meaning that Geoserver only ever has around one feature in memory at a time. This matters for large queries (which would otherwise run out of memory) and for running multiple queries simultaneously.

When a DataStore accepts a Query, it doesn't actually return Features. Instead it returns a FeatureReader, which can be used to read the Features that the Query selects one at a time. The delegate (the GML2 producer in our example) reads a single feature, converts it to GML2, and sends the result off to the output strategy object.
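
The read-one, encode-one, write-one loop can be sketched as follows. Everything here is a stand-in: a plain Iterator plays the role of the FeatureReader, a Writer plays the role of the output strategy, and the XML written is a made-up format, not real GML2. Attribute values are assumed to need no XML escaping.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical streaming sketch (not the Geoserver GML2 producer).
class StreamingEncoder {

    /** Encodes rows one at a time; only the current row is held in memory. */
    static int encode(Iterator<String[]> reader, Writer out) throws IOException {
        int count = 0;
        out.write("<featureCollection>");
        while (reader.hasNext()) {
            String[] row = reader.next();       // read a single feature
            out.write("<state name=\"" + row[0] // convert it and write it out at once
                + "\" abbr=\"" + row[1] + "\"/>");
            count++;
        }
        out.write("</featureCollection>");
        return count;
    }

    /** Convenience wrapper that collects the output in a String. */
    static String encodeToString(Iterator<String[]> reader) {
        StringWriter out = new StringWriter();
        try {
            encode(reader, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Iterator<String[]> reader = Arrays.asList(
            new String[] {"Iowa", "IA"},
            new String[] {"Nebraska", "NE"}).iterator();
        System.out.println(encodeToString(reader));
    }
}
```

Because encode() never collects the rows into a list, its memory use stays flat no matter how many features the query selects.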

GML Encoding

After the output strategy has been determined, the Feature servlet sends the output to a FeatureResponse object. This FeatureResponse object then passes the information on to the GML2FeatureResponseDelegate object.

The GML encoding object takes care of the rest: it encodes the output, which is streamed through the output strategy.

The End

Notes

How Datastores process Query & Filter

Some DataStores (like the database-backed ones) can do most of the Filter processing in the database, using the database's indexes. Other datastores can do "quick" processing of certain components of the Filter. For example, the "normal shapefile" datastore can quickly do bounding-box tests for features. The "indexed shapefile" datastore (that's a shapefile with a .qix spatial index file) can do spatial searching quickly.

Basically, the Filter object is sent to the datastore, which looks at the Filter and breaks it into components:

  1. portions that the datastore can index (i.e. quickly approximate a solution)
  2. portions that the datastore can efficiently calculate (i.e. quickly calculate a solution)
  3. portions that the datastore cannot efficiently calculate (these will be handled by Geotools Java code)

All this is handled transparently by the datastore, so the programmer just has to send a Query object off to the DataStore and does not have to worry about how it is processed. The features returned by the FeatureReader will only be ones that pass the Filter conditions.

Let's look at a PostGIS example for a Query's Filter that looks like this:

(the_geom INTERSECTS <a polygon>) AND (population > 1000000)

This will be translated into the SQL query:

SELECT ... FROM <table> WHERE
     the_geom && <bounding box of the search polygon> -- this is the spatial index operation
 AND intersects(the_geom, <the polygon> )      -- this is the full OGC spatial operation
 AND population > 1000000;

The same query to the "normal shapefile" datastore will be processed differently. The shapefile datastore can perform bounding-box vs bounding-box operations quickly because the shapefile stores the bounding box of each geometry.

The read processing is done in two stages: (a) shapefile optimized and (b) Java-code handled.

foreach row in the shapefile
     If the row's bounding box overlaps the <bounding box of the search polygon>
       THEN send this row to the next stage
       OTHERWISE this feature does not pass the Filter condition

Java code will then take the "approximate" solution that the datastore can quickly compute and fully evaluate the Filter.
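
The two stages can be sketched in plain Java. This is an illustration only, not the shapefile datastore's code: Envelope, Row and select() are hypothetical names, and the "full" filter here checks only the population condition, since an exact geometry intersection test would need a geometry library that is outside the scope of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical two-stage filter sketch (not the Geotools implementation).
class TwoStageFilter {

    /** Axis-aligned bounding box. */
    static final class Envelope {
        final double minX, minY, maxX, maxY;
        Envelope(double minX, double minY, double maxX, double maxY) {
            this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
        }
        boolean overlaps(Envelope o) {
            return minX <= o.maxX && maxX >= o.minX && minY <= o.maxY && maxY >= o.minY;
        }
    }

    /** One row of the data: a name, the stored bounding box, and an attribute. */
    static final class Row {
        final String name;
        final Envelope bounds;
        final long population;
        Row(String name, Envelope bounds, long population) {
            this.name = name; this.bounds = bounds; this.population = population;
        }
    }

    static List<Row> select(List<Row> rows, Envelope searchBounds, Predicate<Row> fullFilter) {
        List<Row> result = new ArrayList<>();
        for (Row row : rows) {
            // Stage (a): cheap bounding-box test, rejects most rows quickly.
            if (!row.bounds.overlaps(searchBounds)) {
                continue;
            }
            // Stage (b): full evaluation of the Filter on the surviving rows.
            if (fullFilter.test(row)) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("A", new Envelope(0, 0, 1, 1), 2_000_000),
            new Row("B", new Envelope(0, 0, 1, 1), 500),
            new Row("C", new Envelope(10, 10, 11, 11), 5_000_000));
        Envelope search = new Envelope(0.5, 0.5, 2, 2);
        for (Row row : select(rows, search, r -> r.population > 1_000_000)) {
            System.out.println(row.name);
        }
    }
}
```

Row C is rejected by the bounding-box stage without its attributes ever being tested, and row B passes the bounding-box stage but fails the full filter.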

The same query to the "indexed shapefile" datastore can be processed even more efficiently. Instead of having to read large portions of the shapefile to test EVERY row to see if the bounding box intersects the search bounding box, it can just read a portion of the spatial index. Java code will then take this "approximate" solution that the datastore can very quickly compute and fully evaluate the Filter.

